BIOTECHNOLOGY SYSTEMS BRANCH



RAW SEQUENCE LISTING 
- ERROR REPORT 





The Biotechnology Systems Branch of the Scientific and Technical Information 
Center (STIC) detected errors when processing the following computer readable 
form: 

Application Serial Number: 
Source: 

Date Processed by STIC:

THE ATTACHED PRINTOUT EXPLAINS DETECTED ERRORS.
PLEASE FORWARD THIS INFORMATION TO THE APPLICANT BY EITHER: 

1) INCLUDING A COPY OF THIS PRINTOUT IN YOUR NEXT COMMUNICATION TO THE 
APPLICANT, WITH A NOTICE TO COMPLY or, 

2) TELEPHONING APPLICANT AND FAXING A COPY OF THIS PRINTOUT, WITH A 
NOTICE TO COMPLY 

FOR CRF SUBMISSION AND PATENTIN SOFTWARE QUESTIONS, PLEASE CONTACT 
MARK SPENCER, TELEPHONE: 703-308-4212; FAX: 703-308-4221 
Effective 12/13/03: TELEPHONE: 571-272-2510; FAX: 571-273-0221



TO REDUCE ERRORED SEQUENCE LISTINGS, PLEASE USE THE CHECKER
VERSION 4.1 PROGRAM, ACCESSIBLE THROUGH THE U.S. PATENT AND
TRADEMARK OFFICE WEBSITE. SEE BELOW FOR ADDRESS:

http://www.uspto.gov/web/offices/pac/checker/chkr41note.htm



Applicants submitting genetic sequence information electronically on diskette or CD-Rom should be aware that there is
a possibility that the disk/CD-Rom may have been affected by treatment given to all incoming mail.

Please consider using alternate methods of submission for the disk/CD-Rom or replacement disk/CD-Rom.

Any reply including a sequence listing in electronic form should NOT be sent to the 20231 zip code address for the
United States Patent and Trademark Office, and instead should be sent via the following to the indicated addresses:

1. EFS-Bio (<http://www.uspto.gov/ebc/efs/downloads/documents.htm>, EFS Submission
User Manual - ePAVE)

2. U.S. Postal Service: Commissioner for Patents, P.O. Box 1450, Alexandria, VA 22313-1450

3. Hand Carry directly to (EFFECTIVE 12/01/03):
U.S. Patent and Trademark Office, Box Sequence, Customer Window, Lobby, Room 1B03, Crystal Plaza Two,
2011 South Clark Place, Arlington, VA 22202

4. Federal Express, United Parcel Service, or other delivery service to: U.S. Patent and Trademark Office,
Box Sequence, Room 1B03-Mailroom, Crystal Plaza Two, 2011 South Clark Place, Arlington, VA 22202



Revised 10/08/03 



Raw Sequence Listing Error Summary 



ERROR DETECTED SUGGESTED CORRECTION SERIAL NUMBER:

ATTN: NEW RULES CASES: PLEASE DISREGARD ENGLISH "ALPHA" HEADERS, WHICH WERE INSERTED BY PTO SOFTWARE

Wrapped Nucleics / Wrapped Aminos    The number/text at the end of each line "wrapped" down to the next line. This may occur
if your file was retrieved in a word processor after creating it. Please adjust your right margin to .3; this will
prevent "wrapping."






Invalid Line Length The rules require that a line not exceed 72 characters in length. This includes white spaces. 
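
A minimal sketch of a pre-submission check for the 72-character limit, assuming the listing is a plain-text file read line by line. The function name `find_long_lines` and the sample lines are hypothetical; this is not part of the Checker program.

```python
MAX_LINE_LENGTH = 72  # limit stated in the rules; white space counts


def find_long_lines(lines, limit=MAX_LINE_LENGTH):
    """Return (line_number, length) for every line longer than `limit`."""
    problems = []
    for number, line in enumerate(lines, start=1):
        # Strip only the line terminator; spaces inside the line still count.
        length = len(line.rstrip("\r\n"))
        if length > limit:
            problems.append((number, length))
    return problems


# Made-up sample lines, for illustration only.
sample = [
    "<210> 1\n",
    "a" * 72 + "\n",  # exactly at the limit: accepted
    "a" * 73 + "\n",  # one character too long: reported
]
print(find_long_lines(sample))
```

Only the line terminator is stripped before measuring, since the rule counts white space toward the limit.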

Misaligned Amino Numbering    The numbering under each 5th amino acid is misaligned. Do not use tab codes between numbers;
use space characters instead.
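
A minimal sketch of how tab codes in a listing could be found and replaced with spaces before submission. The helper names and the sample lines are hypothetical, and the tab width of 4 is an assumption that happens to match three-letter residue codes separated by one space; the result should still be checked by eye.

```python
def find_tab_lines(lines):
    """Return the 1-based numbers of lines that contain a tab character."""
    return [n for n, line in enumerate(lines, start=1) if "\t" in line]


def replace_tabs(line, width=4):
    """Expand tabs to spaces, assuming tab stops every `width` columns."""
    return line.expandtabs(width)


# Made-up sample: a residue line and a numbering line typed with tabs.
sample = [
    "Met Lys Leu Val Cys Val Leu Val Cys Ser\n",
    "1\t\t\t\t5\t\t\t\t\t10\n",
]
print(find_tab_lines(sample))
print(replace_tabs(sample[1]), end="")
```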



Non-ASCII    The submitted file was not saved in ASCII (DOS) text, as required by the Sequence Rules. Please
ensure your subsequent submission is saved in ASCII text.

Variable Length    Sequence(s) contain n's or Xaa's representing more than one residue. Per Sequence Rules,
each n or Xaa can only represent a single residue. Please present the maximum number of each
residue having variable length and indicate in the <220>-<223> section that some may be missing.

PatentIn 2.0 "bug"    A "bug" in PatentIn version 2.0 has caused the <220>-<223> section to be missing from amino acid
sequence(s) ____. Normally, PatentIn would automatically generate this section from the
previously coded nucleic acid sequence. Please manually copy the relevant <220>-<223> section to
the subsequent amino acid sequence. This applies to the mandatory <220>-<223> sections for
Artificial or Unknown sequences.

Skipped Sequences (OLD RULES)    Sequence(s) ____ missing. If intentional, please insert the following lines for each skipped sequence:
(2) INFORMATION FOR SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
(i) SEQUENCE CHARACTERISTICS: (Do not insert any subheadings under this heading)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
This sequence is intentionally skipped

Please also adjust the "(iii) NUMBER OF SEQUENCES:" response to include the skipped sequences.
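
A minimal sketch of how skipped sequences in an old-rules listing could be located and the placeholder lines quoted above generated. The function names, the regular expression, and the sample text are hypothetical, and the sketch assumes each sequence starts with a "(2) INFORMATION FOR SEQ ID NO:n:" header.

```python
import re

# Header that opens each sequence in the old-rules format quoted above.
HEADER = re.compile(r"\(2\)\s+INFORMATION FOR SEQ ID NO:\s*(\d+):")


def missing_seq_ids(text, declared_total):
    """Return SEQ ID numbers in 1..declared_total that have no header in `text`."""
    present = {int(m.group(1)) for m in HEADER.finditer(text)}
    return [n for n in range(1, declared_total + 1) if n not in present]


def placeholder(seq_id):
    """Build the placeholder lines for one intentionally skipped sequence."""
    return "\n".join([
        f"(2) INFORMATION FOR SEQ ID NO:{seq_id}:",
        "(i) SEQUENCE CHARACTERISTICS:",
        f"(xi) SEQUENCE DESCRIPTION: SEQ ID NO:{seq_id}:",
        "This sequence is intentionally skipped",
    ])


# Made-up listing fragment in which SEQ ID NO:2 is absent.
text = (
    "(2) INFORMATION FOR SEQ ID NO:1:\n"
    "(2) INFORMATION FOR SEQ ID NO:3:\n"
)
for seq_id in missing_seq_ids(text, declared_total=3):
    print(placeholder(seq_id))
```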



Skipped Sequences (NEW RULES)    Sequence(s) ____ missing. If intentional, please insert the following lines for each skipped sequence.
<210> sequence id number
<400> sequence id number
000

Use of n's or Xaa's (NEW RULES)    Use of n's and/or Xaa's have been detected in the Sequence Listing.
Per 1.823 of Sequence Rules, use of <220>-<223> is MANDATORY if n's or Xaa's are present.
In <220> to <223> section, please explain location of n or Xaa, and which residue n or Xaa represents.

Invalid <213> Response    Per 1.823 of Sequence Rules, the only valid <213> responses are: Unknown, Artificial Sequence, or
scientific name (Genus/species). <220>-<223> section is required when <213> response is Unknown or
is Artificial Sequence.

Use of <220>    Sequence(s) ____ missing the <220> "Feature" and associated numeric identifiers and responses.
Use of <220> to <223> is MANDATORY if <213> "Organism" response is "Artificial Sequence" or
"Unknown." Please explain source of genetic material in <220> to <223> section.
(See "Federal Register," 06/01/1998, Vol. 63, No. 104, pp. 29631-32) (Sec. 1.823 of Sequence Rules)

PatentIn 2.0 "bug"    Please do not use "Copy to Disk" function of PatentIn version 2.0. This causes a corrupted file,
resulting in missing mandatory numeric identifiers and responses (as indicated on raw sequence
listing). Instead, please use "File Manager" or any other manual means to copy file to floppy disk.
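
A minimal sketch of the <220> requirement described above: a sequence is flagged when its <213> response is "Unknown" or "Artificial Sequence", or when it contains n or Xaa, and no <220> line is present. The function names and the sample listing are hypothetical, and the parsing assumes each numeric identifier starts its own line; this does not reproduce the Checker program's logic.

```python
import re


def has_unknown_residue(record):
    """True if the <400> body contains n (nucleotides) or Xaa (PRT)."""
    m = re.search(r"(?m)^<212>[ \t]*(\S*)", record)
    seq_type = m.group(1).upper() if m else ""
    body = record.split("<400>", 1)[1] if "<400>" in record else ""
    tokens = re.findall(r"[A-Za-z]+", body)
    if seq_type == "PRT":
        return "Xaa" in tokens
    # Assumption: anything not marked PRT is treated as nucleotides.
    return any("n" in token.lower() for token in tokens)


def missing_feature_blocks(text):
    """Return SEQ ID values that need a <220> section but lack one."""
    starts = [m.start() for m in re.finditer(r"(?m)^<210>", text)]
    ends = starts[1:] + [len(text)]
    flagged = []
    for start, end in zip(starts, ends):
        record = text[start:end]
        seq_id = re.match(r"<210>[ \t]*(\S*)", record).group(1)
        organism = re.search(r"(?m)^<213>[ \t]*(.*)$", record)
        name = organism.group(1).strip().lower() if organism else ""
        required = (name in ("unknown", "artificial sequence")
                    or has_unknown_residue(record))
        if required and not re.search(r"(?m)^<220>", record):
            flagged.append(seq_id)
    return flagged


# Made-up listing, for illustration only.
listing = """<210> 1
<211> 12
<212> DNA
<213> Artificial Sequence
<400> 1
acgtacgtacgt
<210> 2
<211> 8
<212> DNA
<213> Homo sapiens
<220>
<223> n at position 4 is a, c, g, or t
<400> 2
acgnacgt
<210> 3
<211> 3
<212> PRT
<213> Homo sapiens
<400> 3
Met Asn Xaa
"""
print(missing_feature_blocks(listing))
```

The sequence type is read first because "Asn" contains the letter n; for PRT records only a whole "Xaa" token counts.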



13 Misuse of n/Xaa    "n" can only represent a single nucleotide; "Xaa" can only represent a single amino acid



AMC - Biotechnology Systems Branch - 09/09/2003 



RAW SEQUENCE LISTING DATE: 04/02/2004 

PATENT APPLICATION: US/10/809,816 TIME: 09:11:36 

Input Set : A:\SEQLIST_1507.TXT 

Output Set: N:\CRF4\04022004\J809816.raw 

5 <110> APPLICANT: LI, Shyr-Jiann et al. 

7 <120> TITLE OF INVENTION: ISOLATED MONKEY CATHEPSIN S PROTEINS,

8 NUCLEIC ACID MOLECULES ENCODING MONKEY CATHEPSIN S PROTEINS, 

9 AND USES THEREOF 

11 <130> FILE REFERENCE: CL001507 
C--> 13 <140> CURRENT APPLICATION NUMBER: US/10/809,816
C--> 13 <141> CURRENT FILING DATE: 2004-03-26

13 <160> NUMBER OF SEQ ID NOS: 11

15 <170> SOFTWARE: FastSEQ for Windows Version 4.0 



ERRORED SEQUENCES 

378 <210> SEQ ID NO: 

379 <211> LENGTH: 24 

380 <212> TYPE: DNA 

381 <213> ORGANISM 

383 <400> SEQUENCE 

384 gggccctgga agcacagctg aagc 
E--> 385
E--> 387




Does Not Comply 
Corrected Diskette Needed 




24 







file://C:\CRF4\Outhold\VsrJ809816.htm
VERIFICATION SUMMARY DATE: 04/02/2004

PATENT APPLICATION: US/10/809,816 TIME: 09:11:38

Input Set : A:\SEQLIST_1507.TXT 

Output Set: N:\CRF4\04022004\J809816.raw 

L:13 M:270 C: Current Application Number differs, Replaced Current Application No
L:13 M:271 C: Current Filing Date differs, Replaced Current Filing Date
L:385 M:254 E: No. of Bases conflict, this line has no nucleotides.
M:254 Repeated in SeqNo=11
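
A minimal sketch of a base-count check in the spirit of message M:254, using the 24-base sequence and <211> value shown in the errored sequence above. One plausible reading, not stated in the report, is that the running count "24" wrapped onto its own line (the Wrapped Nucleics case). The function name and message wording are hypothetical and do not reproduce the Checker program.

```python
import re


def check_sequence_lines(declared_length, lines):
    """Check nucleotide lines that should each end with a running base count."""
    messages = []
    total = 0
    for number, line in enumerate(lines, start=1):
        bases = "".join(re.findall(r"[a-z]+", line))
        trailing = re.search(r"(\d+)\s*$", line)
        if not bases:
            # A non-empty line without bases, e.g. a count that wrapped.
            if line.strip():
                messages.append(f"line {number}: no nucleotides on this line")
            continue
        total += len(bases)
        if trailing is None:
            messages.append(f"line {number}: running count missing")
        elif int(trailing.group(1)) != total:
            messages.append(
                f"line {number}: count {trailing.group(1)} != {total}")
    if total != declared_length:
        messages.append(f"<211> says {declared_length}, counted {total}")
    return messages


# Sequence from the report; the wrapped layout is an assumed reconstruction.
wrapped = ["gggccctgga agcacagctg aagc", "24"]
fixed = ["gggccctgga agcacagctg aagc" + " " * 39 + "24"]

print(check_sequence_lines(24, wrapped))
print(check_sequence_lines(24, fixed))
```

With the count kept on the same line as the bases, the sketch reports nothing for this sequence.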






4/2/04 




<210> 7 
<211> 322 
<212> PRT 




<213> consensus



<400> 7 

Met Lys Leu Val Cys Val Leu Val Cys Ser Ser Ala Val Ala Gln Leu
1               5                   10                  15

His Lys Asp Pro Thr Leu Asp His His Trp Leu Trp Lys Lys Thr Tyr
            20                  25                  30

Gly Lys Gln Tyr Lys Glu Lys Asn Glu Glu Ala Val Arg Arg Leu Ile
        35                  40                  45

Trp Glu Lys Asn Leu Lys Phe Val Met Leu His Asn Leu Glu His Ser
    50                  55                  60

Met Gly Met His Ser Tyr Asp Leu Gly Met Asn His Leu Gly Asp Met
65                  70                  75                  80

Thr Ser Glu Glu Val Met Ser Leu Met Ser Ser Leu Arg Val Pro Ser
                85                  90                  95

Gln Trp Gln Arg Asn Ile Thr Tyr Lys Ser Asn Asn Gln Leu Pro Asp
            100                 105                 110

Ser Val Asp Trp Arg Glu Lys Gly Cys Val Thr Glu Val Lys Tyr Gln
        115                 120                 125

Gly Ser Cys Gly Ala Cys Trp Ala Phe Ser Ala Val Gly Ala Leu Glu
    130                 135                 140

Ala Gln Leu Lys Leu Lys Thr Gly Lys Leu Val Ser Leu Ser Ala Gln
145                 150                 155                 160

Asn Leu Val Asp Cys Ser Thr Glu Lys Tyr Gly Asn Lys Gly Cys Asn
                165                 170                 175

Gly Gly Phe Met Thr Ala Phe Gln Tyr Ile Ile Asp Asn Gly Ile Asp
            180                 185                 190

Ser Asp Ala Ser Tyr Pro Tyr Lys Ala Met Asp Gln Lys Cys Gln Tyr
        195                 200                 205

Asp Ser Lys Tyr Arg Ala Ala Thr Cys Ser Lys Tyr Thr Glu Leu Pro
    210                 215                 220

Tyr Gly Arg Glu Asp Val Leu Lys Glu Ala Val Ala Asn Lys Gly Pro
225                 230                 235                 240

Val Ser Val Gly Val Asp Ala Ser His Pro Ser Phe Phe Leu Tyr Arg
                245                 250                 255

Ser Gly Val Tyr Tyr Glu Pro Ser Cys Thr Gln Asn Val Asn His Gly
            260                 265                 270

Val Leu Val Val Gly Tyr Gly Leu Asn Gly Lys Glu Tyr Trp Leu Val
        275                 280                 285

Lys Asn Ser Trp Gly Asn Phe Gly Glu Gln Gly Tyr Ile Arg Met Ala
    290                 295                 300

Arg Asn Lys Gly Asn His Cys Gly Ile Ala Ser Tyr Pro Ser Tyr Pro
305                 310                 315                 320

Glu Ile



<210> 8 
<211> 31 
<212> 

<213> primer



<400> 8 

ccggaattct tgcataaaga tcccaccctg g 




31 



The type of errors shown exist throughout
the Sequence Listing. Please check subsequent
sequences for similar errors.



